KEYWORD EXTRACTION FROM A SINGLE DOCUMENT USING WORD CO-OCCURRENCE STATISTICAL INFORMATION
Similar resources
Keyword Extraction from a Single Document using Word Co-occurrence Statistical Information
We present a new keyword extraction algorithm that applies to a single document without using a corpus. Frequent terms are extracted first; then a set of co-occurrences between each term and the frequent terms, i.e., occurrences in the same sentences, is generated. The co-occurrence distribution shows the importance of a term in the document as follows. If probability distribution of co-occurrence betwee...
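The first two steps named in this abstract (extracting frequent terms, then counting sentence-level co-occurrence between each term and those frequent terms) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `tokenize` and `sentence_cooccurrence`, the `top_n` parameter, the naive punctuation-based sentence split, and the absence of stop-word removal or stemming are all assumptions made for illustration. The scoring step is not shown because the abstract is truncated before describing it.

```python
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def sentence_cooccurrence(text, top_n=2):
    """Return (frequent_terms, cooc), where cooc[w][g] is the number of
    sentences in which term w appears together with frequent term g.

    Hypothetical helper for illustration only.
    """
    # Naive sentence split on '.', '!' and '?' (an assumption of this sketch).
    sentences = [tokenize(s) for s in re.split(r"[.!?]+", text)]
    sentences = [s for s in sentences if s]

    # Step 1: pick the top_n most frequent terms in the document.
    counts = Counter(tok for sent in sentences for tok in sent)
    frequent = [term for term, _ in counts.most_common(top_n)]

    # Step 2: count co-occurrence of every term with each frequent term,
    # where co-occurrence means appearing in the same sentence.
    cooc = defaultdict(Counter)
    for sent in sentences:
        terms = set(sent)
        for w in terms:
            for g in frequent:
                if g in terms and g != w:
                    cooc[w][g] += 1
    return frequent, cooc


if __name__ == "__main__":
    sample = "the cat sat. the cat ran. a dog sat with the cat."
    frequent, cooc = sentence_cooccurrence(sample, top_n=2)
    print(frequent)
    for term in sorted(cooc):
        print(term, dict(cooc[term]))
```

Counting each sentence at most once per term pair (by converting the sentence to a set) is one possible reading of "occurrences in the same sentences"; counting repeated tokens separately would be an equally plausible variant.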
Keyword Extraction from a Single Document Using Centrality Measures
Keywords characterize the topics discussed in a document. Extracting a small set of keywords from a single document is an important problem in text mining. We propose a hybrid structural and statistical approach to extract keywords. We represent the given document as an undirected graph, whose vertices are words in the document and the edges are labeled with a dissimilarity measure between two ...
Segmented Spoken Document Retrieval Using Word Co-occurrence Information
This paper presents several approaches for NTCIR-11 SpokenQuery&Doc [1] and proposes several schemes that use word co-occurrence information for spoken document retrieval. Automatic transcriptions of spoken documents usually contain misrecognized words, which significantly degrades the performance of spoken document retrieval. The cosine similarity to measure a document similarity must be i...
Single Document Keyphrase Extraction Using Label Information
Keyphrases have found wide-ranging application in NLP and IR tasks such as document summarization, indexing, labeling, clustering, and classification. In this paper we pose the problem of extracting label-specific keyphrases from a document that has document-level metadata associated with it, namely labels or tags (i.e., a multi-labeled document). Unlike other, supervised or unsupervised, methods f...
Document Clustering Using Co-word Analysis and Formation of Keyword against Document Matrix
The complexity of retrieving relevant documents from a large corpus is among the most common challenges in web mining and search engines. In addition, the growth of unlabelled and unsupervised documents increases this complexity. Document clustering algorithms play a vital role in reducing this problem. In this paper, an algorithm was proposed to cluster ...
Journal
Journal title: International Journal on Artificial Intelligence Tools
Year: 2004
ISSN: 0218-2130,1793-6349
DOI: 10.1142/s0218213004001466